
200 research outputs found

CompaGB: An open framework for genome browsers comparison

Author: J Soh
N Sato
J Yang
R Ghai
M Friedel
L Stein
TS Furey
JD Gans
P Schattner
M Hoebeke
LD Stein
WJ Kent
TJ Hubbard
T Carver
ME Skinner
TA Down
Publication venue: BioMed Central
Publication date: 01/01/1992

Background: Tools to visualize and explore genomes hold a central place in genomics, and the diversity of genome browsers has increased dramatically over the last few years. Comparing and choosing a well-adapted genome browser is often a daunting task, as it requires multidisciplinary knowledge and the number of tools, functionalities and features is overwhelming. Findings: To assist in this task, we propose a community-based framework based on two cornerstones: (i) the implementation of an industry-promoted software qualification method (QSOS) adapted for genome browser evaluations, and (ii) a web resource providing numerous facilities either for visualizing comparisons or performing new evaluations. We formulated 60 criteria specifically for genome browsers, and incorporated another 65 directly from QSOS's generic section. Those criteria aim to answer versatile needs, ranging from those of a biologist whose interest primarily lies in user-friendly and informative functionalities, to those of a bioinformatician who wants to integrate the genome browser into a wider framework, or of a computer scientist who might choose software according to more technical features. We developed a dedicated web application to enrich the existing QSOS functionalities (weighting of criteria, user profile) with features of interest to a community-based framework: easy management of evolving data, user comments... Conclusions: The framework is available at http://genome.jouy.inra.fr/CompaGB. It is open to anyone who wishes to participate in the evaluations. It helps the scientific community to (1) choose a genome browser that would better fit their particular project, (2) visualize features comparatively with easily accessible formats, such as tables or radar plots and (3) perform their own evaluation against the defined criteria. To illustrate the CompaGB functionalities, we have evaluated seven genome browsers according to the implemented methodology.
A summary of the features of the compared genome browsers is presented and discussed.
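The weighting of criteria by user profile mentioned in the abstract can be illustrated with a small sketch. It assumes each criterion is scored on a 0 to 2 scale and that a profile is a set of per-criterion weights; the criterion names, scores and weights below are hypothetical and are not taken from CompaGB.

```python
def weighted_score(scores, weights):
    """Weighted mean of criterion scores; criteria with zero weight are ignored."""
    total_weight = sum(weights.get(criterion, 0) for criterion in scores)
    if total_weight == 0:
        raise ValueError("at least one criterion needs a positive weight")
    weighted_sum = sum(scores[c] * weights.get(c, 0) for c in scores)
    return weighted_sum / total_weight


# Hypothetical evaluations (0 = not covered, 2 = fully covered).
browser_a = {"ease_of_install": 2, "track_formats": 1, "api_access": 0}
browser_b = {"ease_of_install": 1, "track_formats": 2, "api_access": 2}

# Hypothetical user profiles: the same scores, weighted differently.
biologist = {"ease_of_install": 3, "track_formats": 1, "api_access": 0}
bioinformatician = {"ease_of_install": 1, "track_formats": 2, "api_access": 3}

for name, profile in [("biologist", biologist), ("bioinformatician", bioinformatician)]:
    print(name, weighted_score(browser_a, profile), weighted_score(browser_b, profile))
```

With these made-up numbers the biologist profile favours the first browser and the bioinformatician profile favours the second, which is the kind of profile-dependent ranking the weighting is meant to expose.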

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL Descartes

ProdInra

Online Research Database In Technology

Identification of disease-causing genes using microarray data mining and gene ontology

Author: A Mohammadi
A Zhang
AA Alizadeh
Azadeh Mohammadi
B Duval
BF Souza
C Ambroise
C Ding
C Tago
D Lin
D Singh
E Martinez
FM Couto
I Guyon
I Inza
J Jaeger
JJ Jiang
L Li
L Yu
L Ziaei
Mansoor Salehi
Mohammad H Saraee
N Cristianini
P Pavlidis
P Resnik
PA Mundra
PA Mundra
PJ Park
R Genuer
RF Weaver
S Li
S Li
TM Huang
TR Golub
TS Furey
U Alon
W Xu
Y Ding
Y Saeys
Y Wang
YL Chin
Z Xie
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Background: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes. Methods: We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. In fact, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results. Results: The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth. 
Conclusions: The proposed method addresses the weakness of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with a high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers.
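A minimal sketch of the filtering and redundancy-reduction idea follows. It assumes one common form of the Fisher score, (mean1 - mean0)^2 / (var1 + var0), and uses a Pearson-correlation cutoff as a stand-in redundancy criterion; the SVMRFE stage and the Gene Ontology component are omitted, and the gene names, expression values and `max_corr` threshold are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Squared difference of class means divided by the sum of class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    denom = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / denom if denom > 0 else 0.0


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_genes(expr, labels, top_k, max_corr=0.9):
    """Rank genes by Fisher score, skipping genes that are highly
    correlated with a gene that has already been kept."""
    ranked = sorted(expr, key=lambda g: fisher_score(expr[g], labels), reverse=True)
    kept = []
    for gene in ranked:
        if all(abs(pearson(expr[gene], expr[h])) <= max_corr for h in kept):
            kept.append(gene)
        if len(kept) == top_k:
            break
    return kept


# Hypothetical expression values: geneB is a scaled copy of geneA (redundant).
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "geneA": [5.0, 5.2, 4.8, 1.0, 1.1, 0.9],
    "geneB": [10.0, 10.4, 9.6, 2.0, 2.2, 1.8],
    "geneC": [3.0, 2.5, 3.5, 2.0, 1.5, 2.5],
    "geneD": [2.0, 3.0, 2.5, 2.4, 2.6, 2.5],
}
selected = select_genes(expr, labels, top_k=2)
print(selected)
```

In this toy data only one of the two perfectly correlated genes survives, so the second slot goes to a less discriminative but non-redundant gene.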

University of Salford Institutional Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Notes on wormhole existence in scalar-tensor and F(R) gravity

Author: A. A. Starobinsky
A. A. Starobinsky
A. B. Balakin
A. Bhattacharia
A. G. Agnese
A. V. B. Arellano
C. H. Brans
D. H. Coule
D. Hochberg
D. Hochberg
F. S. N. Lobo
F. S. N. Lobo
H. G. Ellis
H. Nariai
I. Quiros
I. Z. Fisher
K. A. Bronnikov
K. A. Bronnikov
M. V. Skvortsova
N. Furey
O. Bergmann
S. A. Appleby
S. A. Hayward
T. P. Sotiriou
V. Gorini
V. Gorini
V. Ts. Gurovich
Publication venue: 'Pleiades Publishing Ltd'
Publication date: 08/06/2010

Some recent papers have claimed the existence of static, spherically symmetric wormhole solutions to gravitational field equations in the absence of ghost (or phantom) degrees of freedom. We show that in some such cases the solutions in question are actually not of wormhole nature while in cases where a wormhole is obtained, the effective gravitational constant G_eff is negative in some region of space, i.e., the graviton becomes a ghost. In particular, it is confirmed that there are no vacuum wormhole solutions of the Brans-Dicke theory with zero potential and the coupling constant \omega > -3/2, except for the case \omega = 0; in the latter case, G_eff < 0 in the region beyond the throat. The same is true for wormhole solutions of F(R) gravity: special wormhole solutions are only possible if F(R) contains an extremum at which G_eff changes its sign.
Comment: 7 two-column pages, no figures, to appear in Grav. Cosmol. A misprint corrected, references updated.

arXiv.org e-Print Archive

Crossref

National Open Repository Aggregator (NORA)

Psoriasis prediction from genome-wide SNP profiles

Author: A Tsalenko
B Goertzel
B Zhang
BH Cho
D Brinza
F Zhou
I Guyon
M Gormley
MH Gail
Momiao Xiong
N Slonim
N Slonim
N Zhou
Q Lu
RB D'Agostino
RP Nair
Shenying Fang
TM Cover
TS Furey
Xiangzhong Fang
XL Wang
YH Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: With the availability of large-scale genome-wide association study (GWAS) data, choosing an optimal set of SNPs for disease susceptibility prediction is a challenging task. This study aimed to use single nucleotide polymorphisms (SNPs) to predict psoriasis by searching GWAS data. Methods: In total, we had 2,798 samples and 451,724 SNPs. The process of searching for a set of SNPs to predict susceptibility to psoriasis consisted of two steps. The first was to search the GWAS dataset for the top 1,000 SNPs with high accuracy for prediction of psoriasis. The second was to search for an optimal SNP subset for predicting psoriasis. The sequential information bottleneck (sIB) method was compared with classical linear discriminant analysis (LDA) for classification performance. Results: The best test harmonic mean of sensitivity and specificity for predicting psoriasis by sIB was 0.674 (95% CI: 0.650-0.698), while only 0.520 (95% CI: 0.472-0.524) was reported for predicting disease by LDA. Our results indicate that the new classifier sIB performs better than LDA in the study. Conclusions: The fact that a small set of SNPs can predict disease status with an average accuracy of 68% makes it possible to use SNP data for psoriasis prediction.
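The performance measure quoted in the results, the harmonic mean of sensitivity and specificity, can be computed from a confusion matrix as in the sketch below. The counts are hypothetical and are not taken from the study.

```python
def harmonic_mean_sens_spec(tp, fn, tn, fp):
    """Harmonic mean of sensitivity (tp / (tp + fn)) and specificity (tn / (tn + fp))."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


# Hypothetical counts: 100 cases (80 detected) and 100 controls (60 correctly rejected).
score = harmonic_mean_sens_spec(tp=80, fn=20, tn=60, fp=40)
print(round(score, 3))
```

Unlike plain accuracy, this measure drops sharply when either sensitivity or specificity is poor, which matters when cases and controls are unbalanced.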

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The radial arrangement of the human chromosome 7 in the lymphocyte cell nucleus is associated with chromosomal band gene density

Author: A Bolzer
A Ono
AM Boutanaev
B Dutrillaux
BA Boggs
C Federico
C Federico
Catia Daniela Cantarella
CM Clemson
Concetta Federico
D Zink
E Lukasova
ED Andrulis
EV Volpi
G Bernardi
G D’Onofrio
H Tanabe
H Tanabe
HA Foster
I Solovei
IHGSC (International Human Genome Sequencing Consortium)
J Ferreira
J Strouboulis
J Zhou
JA Croft
JJ Roix
JM Bridger
K Kupper
KE Brown
L Andreozzi
M Cockell
M Costantini
M Neusser
N Gilbert
N Sadoni
NV Petrova
Patrizia Di Mare
PS Masny
S Boyle
S D’Antoni
S Saccone
S Saccone
Sabrina Tosi
Salvatore Saccone
SW Scherer
T Cremer
TS Furey
U Francke
WA Bickmore
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/04/2008

This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ Springer-Verlag 2008. In the nuclei of human lymphocytes, chromosome territories are distributed according to the average gene density of each chromosome. However, chromosomes are very heterogeneous in size and base composition, and can contain both very gene-dense and very gene-poor regions. Thus, a precise analysis of chromosome organisation in the nuclei should consider also the distribution of DNA belonging to the chromosomal bands in each chromosome. To improve our understanding of the chromatin organisation, we localised chromosome 7 DNA regions, endowed with different gene densities, in the nuclei of human lymphocytes. Our results showed that this chromosome in cell nuclei is arranged radially with the gene-dense/GC-richest regions exposed towards the nuclear interior and the gene-poorest/GC-poorest ones located at the nuclear periphery. Moreover, we found that chromatin fibres from the 7p22.3 and the 7q22.1 bands are not confined to the territory of the bulk of this chromosome, protruding towards the inner part of the nucleus. Overall, our work demonstrates the radial arrangement of the territory of chromosome 7 in the lymphocyte nucleus and confirms that human genes occupy specific radial positions, presumably to enhance intra- and inter-chromosomal interaction among loci displaying a similar expression pattern, and/or similar replication timing.

Crossref

Brunel University Research Archive

Defining genes: a computational framework

Author: BO Palsson
Christian V. Forst
CV Forst
D Karolchik
David C. Krakauer
E Dicou
E Pennisi
G Berry
H Pearson
I Brigandt
JD Walton
K Scherrer
L Duret
MB Gerstein
MD Laubichler
MM Krem
Peter F. Stadler
RG Taylor
S Griffiths-Jones
SJ Prohaska
Sonja J. Prohaska
TR Gingeras
TS Furey
Y Tohsato
Publication venue: Springer-Verlag
Publication date: 01/01/2009

The precise elucidation of the gene concept has become the subject of intense discussion in light of results from several large, high-throughput surveys of transcriptomes and proteomes. In previous work, we proposed an approach for constructing gene concepts that combines genomic heritability with elements of function. Here, we introduce a definition of the gene within a computational framework of cellular interactions. The definition seeks to satisfy the practical requirements imposed by annotation, capture logical aspects of regulation, and encompass the evolutionary property of homology.

Crossref

Springer - Publisher Connector

Fraunhofer-ePrints

PubMed Central

A comparative study on gene-set analysis methods for assessing differential expression associated with the survival phenotype

Author: A Rosenwald
A Subramanian
AA Alizadeh
AJ Adewale
AL Boulesteix
AP Crijns
E Bair
H Binder
HK Dressman
I Dinu
J Gui
Jinheum Kim
JJ Goeman
K Jung
L Tian
Q Liu
R Tibshirani
Seungyeoun Lee
Sunho Lee
SY Kim
TR Golub
TS Furey
VK Mootha
X Chen
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Many gene-set analysis methods have been previously proposed and compared through simulation studies and analysis of real datasets for binary phenotypes. We focused on the survival phenotype and compared the performances of Gene Set Enrichment Analysis (GSEA), Global Test (GT), Wald-type Test (WT) and Global Boost Test (GBST) methods in a simulation study and on two ovarian cancer data sets. We considered two versions of GSEA by allowing different weights: GSEA1 uses equal weights, yielding results similar to the Kolmogorov-Smirnov test; while GSEA2's weights are based on the correlation between genes and the phenotype. Results: We compared GSEA1, GSEA2, GT, WT and GBST in a simulation study with various settings for the correlation structure of the genes and the association parameter between the survival outcome and the genes. Simulation results indicated that GT, WT and GBST consistently have higher power than GSEA1 and GSEA2 across all scenarios. However, the power of the five tests depends on the combination of correlation structure and association parameter. For the ovarian cancer data set, using the FDR threshold of q Conclusion: Simulation studies and a real data example indicate that GT, WT and GBST tend to have high power, whereas GSEA1 and GSEA2 have lower power. We also found that the power of the five tests is much higher when genes are correlated than when genes are independent, when survival is positively associated with genes. It seems that there is a synergistic effect in detecting significant gene sets when significant genes have within-class correlation and the association between survival and genes is positive or negative (i.e., one-direction correlation).
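The difference between the two GSEA variants can be sketched with the usual running-sum enrichment score, in which an exponent `p` controls the weighting: `p = 0` gives equal weights (Kolmogorov-Smirnov-like, as for GSEA1) and `p = 1` weights each hit by the absolute gene-phenotype correlation (as for GSEA2). This is an illustrative sketch, not the authors' implementation; the gene names and correlation values are hypothetical, and how the correlation with a survival outcome is computed is left out.

```python
def enrichment_score(ranked_genes, correlations, gene_set, p=1.0):
    """Running-sum enrichment score over a ranked gene list.

    Hits (genes in gene_set) add |correlation|**p, normalised over all hits;
    misses subtract 1 / (number of misses). The score is the running-sum
    value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weights = [
        abs(correlations[gene]) ** p if is_hit else 0.0
        for gene, is_hit in zip(ranked_genes, hits)
    ]
    total = sum(hit_weights)
    if total == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for weight, is_hit in zip(hit_weights, hits):
        running += weight / total if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical genes, already ranked by correlation with the phenotype.
correlations = {"g1": 0.9, "g2": 0.7, "g3": 0.4, "g4": -0.1, "g5": -0.5, "g6": -0.8}
ranked = sorted(correlations, key=correlations.get, reverse=True)
gene_set = {"g1", "g4"}

es_equal = enrichment_score(ranked, correlations, gene_set, p=0.0)
es_weighted = enrichment_score(ranked, correlations, gene_set, p=1.0)
print(es_equal, es_weighted)
```

In this toy example the weighted version yields a larger score because the strongly correlated gene `g1` dominates the hit weights, whereas the equal-weight version treats both set members alike.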

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ANMM4CBR: a case-based reasoning method for gene expression data classification

Author: A Aamodt
Bangpeng Yao
C Ding
D Berrar
F Díaz
H Li
I Jurisica
J Khan
J Kolodner
J Ye
JY Koo
K Fukunaga
M Bressan
M Dettling
MB Eisen
N Arshadi
OG Troyanskaya
OG Troyanskaya
PJ Park
R Bouckaert
RA Heller
S Dudoit
S Ramaswamy
SC Johnson
Shao Li
TR Golub
TS Furey
U Alon
W Pan
Y Freund
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Accurate classification of microarray data is critical for successful clinical diagnosis and treatment. The "curse of dimensionality" problem and noise in the data, however, undermine the performance of many algorithms. Method: In order to obtain a robust classifier, a novel Additive Nonparametric Margin Maximum for Case-Based Reasoning (ANMM4CBR) method is proposed in this article. ANMM4CBR employs a case-based reasoning (CBR) method for classification. CBR is a suitable paradigm for microarray analysis, where the rules that define the domain knowledge are difficult to obtain because usually only a small number of training samples are available. Moreover, in order to select the most informative genes, we propose to perform feature selection via additively optimizing a nonparametric margin maximum criterion, which is defined based on gene pre-selection and sample clustering. Our feature selection method is very robust to noise in the data. Results: The effectiveness of our method is demonstrated on both simulated and real data sets. We show that the ANMM4CBR method performs better than some state-of-the-art methods such as support vector machine (SVM) and k nearest neighbor (kNN), especially when the data contains a high level of noise. Availability: The source code is attached as an additional file of this paper.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction

Author: A Statnikov
AC Tan
C Bishop
C Lai
D Geman
DG Beer
I Guyon
I Inza
J Jin
J Weston
LJ van 't Veer
Mark A Kon
MH Asyali
P Baldi
Ping Shi
Qifu Zhu
R Blanco
R Kohavi
S Hanshall
S Ma
S Yoon
SL Pomeroy
Surajit Ray
TM Cover
TR Golub
TS Furey
V Vinaya
VN Vapnik
X Zhang
Y Saeys
Y Wang
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The widely used k top scoring pair (k-TSP) algorithm is a simple yet powerful parameter-free classifier. It owes its success in many cancer microarray datasets to an effective feature selection algorithm that is based on relative expression ordering of gene pairs. However, its general robustness does not extend to some difficult datasets, such as those involving cancer outcome prediction, which may be due to the relatively simple voting scheme used by the classifier. We believe that the performance can be enhanced by separating its effective feature selection component and combining it with a powerful classifier such as the support vector machine (SVM). More generally the top scoring pairs generated by the k-TSP ranking algorithm can be used as a dimensionally reduced subspace for other machine learning classifiers. Results: We developed an approach integrating the k-TSP ranking algorithm (TSP) with other machine learning methods, allowing combination of the computationally efficient, multivariate feature ranking of k-TSP with multivariate classifiers such as SVM. We evaluated this hybrid scheme (k-TSP+SVM) in a range of simulated datasets with known data structures. As compared with other feature selection methods, such as a univariate method similar to Fisher's discriminant criterion (Fisher), or a recursive feature elimination embedded in SVM (RFE), TSP is increasingly more effective than the other two methods as the informative genes become progressively more correlated, which is demonstrated both in terms of the classification performance and the ability to recover true informative genes. We also applied this hybrid scheme to four cancer prognosis datasets, in which k-TSP+SVM outperforms k-TSP classifier in all datasets, and achieves either comparable or superior performance to that using SVM alone. In concurrence with what is observed in simulation, TSP appears to be a better feature selector than Fisher and RFE in some of the cancer datasets.
Conclusions: The k-TSP ranking algorithm can be used as a computationally efficient, multivariate filter method for feature selection in machine learning. SVM in combination with k-TSP ranking algorithm outperforms k-TSP and SVM alone in simulated datasets and in some cancer prognosis datasets. Simulation studies suggest that as a feature selector, it is better tuned to certain data characteristics, i.e. correlations among informative genes, which is potentially interesting as an alternative feature ranking method in pathway analysis.
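The relative-expression ranking behind top scoring pairs can be sketched as follows. For each gene pair (i, j) the score is the absolute difference, between the two classes, of the frequency with which gene i is expressed below gene j. This is a minimal illustration with hypothetical gene names and values; it covers only the pair ranking, not the k-TSP voting classifier or the SVM stage described in the abstract.

```python
from itertools import combinations


def tsp_scores(samples, labels):
    """Score every gene pair by |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|.

    samples: list of dicts mapping gene name -> expression value.
    labels:  list of 0/1 class labels, one per sample.
    """
    genes = sorted(samples[0])
    by_class = {c: [s for s, y in zip(samples, labels) if y == c] for c in (0, 1)}
    scores = {}
    for gi, gj in combinations(genes, 2):
        freq = [
            sum(s[gi] < s[gj] for s in by_class[c]) / len(by_class[c])
            for c in (0, 1)
        ]
        scores[(gi, gj)] = abs(freq[0] - freq[1])
    return scores


def top_pairs(samples, labels, k):
    """Return the k highest-scoring gene pairs."""
    scores = tsp_scores(samples, labels)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data: the ordering of genes "a" and "b" flips between classes.
samples = [
    {"a": 1, "b": 2, "c": 5},
    {"a": 2, "b": 3, "c": 1},
    {"a": 4, "b": 1, "c": 3},
    {"a": 5, "b": 2, "c": 6},
]
labels = [0, 0, 1, 1]
scores = tsp_scores(samples, labels)
print(top_pairs(samples, labels, k=1))
```

Because the score depends only on within-sample orderings, it is unchanged by any monotone per-sample normalisation, which is one reason rank-based pair features travel well across datasets.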

CiteSeerX

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Enlighten

Detection of regulator genes and eQTLs in gene networks

Author: A Butte
A Chatr-Aryamontri
A Clauset
A Joshi
A Joshi
A Kundaje
AA Shabalin
AJ Enright
AJ Walhout
AS Dimas
B Schwanhausser
B Zhang
B Zhang
C Cenik
CO Daub
D Koller
DA Cusanovich
DM Greenawalt
E Bonnet
E Ravasz
E Segal
EC Neto
EE Schadt
EJ Foss
F Grubert
F Yue
FA Cubillos
FW Albert
G Hemani
G Nicholson
GD Smith
GH Golub
H Foroughi Asl
H Talukdar
HN Kadarmideen
J Millstein
J Qi
J Zhu
J Zhu
J Zhu
JE Aten
JF Ayroles
JJ Faith
JL Björkegren
JS Liu
K Basso
K Qu
KG Ardlie
L Wu
LA Hindorff
LH Hartwell
LS Chen
M Ashburner
M Civelek
M Georges
M Gerstein
M Medvedovic
M Schmidt
M Scutari
MA Schaub
MB Eisen
MD Ritchie
ME Goddard
MEJ Newman
MEJ Newman
MV Rockman
MV Rockman
N Friedman
N Friedman
N Friedman
N Laird
O Stegle
P Langfelder
P Langfelder
P Langfelder
P Lu
R Sharan
R Sharan
RB Brem
RW Williams
S Lee
S Roy
S Tavazoie
SI Lee
SM Waszak
SS Rao
T Lappalainen
T Michoel
TA Manolio
TF Mackay
The ENCODE
TS Furey
VG Cheung
W Cookson
W Zhang
Y Chen
Y Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2016

Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci" or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2 figures.
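One of the steps listed in the abstract, reconstructing a co-expression network, can be sketched in its simplest form as thresholding pairwise Pearson correlations between expression profiles. The gene names, values and the 0.8 cutoff below are hypothetical, and the methods covered by the review (module detection, Bayesian networks, eQTL mapping) go well beyond this.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_edges(expr, threshold=0.8):
    """Return (gene, gene, r) for every pair whose |r| reaches the threshold."""
    edges = []
    for g, h in combinations(sorted(expr), 2):
        r = pearson(expr[g], expr[h])
        if abs(r) >= threshold:
            edges.append((g, h, r))
    return edges


# Hypothetical expression profiles across five samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expr)
print(edges)
```

A hard threshold is the crudest choice; it yields an undirected, unweighted graph and says nothing about causal direction, which is where the genotype information from eQTLs comes in.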

arXiv.org e-Print Archive

Crossref